ABSTRACT

Part and bipartial canonical correlations were
developed by extending the definitions of part and bipartial
correlation to sets of variates. These coefficients may be used to
help researchers explore relationships which exist among several sets
of normally distributed variates. (Author)

Part and Bipartial Canonical Correlation Analysis
Neil H. Timm and James E. Carlson
University of Pittsburgh

1. Introduction

The concept of simple correlation was introduced into statistics
by Sir Francis Galton in several papers published during the 1880s.
However, his ideas on correlation were generally unknown until his
book, Natural Inheritance, was published in 1889. Galton's work
stimulated Pearson (1896, 1898) to develop a precise mathematical
theory of correlation which led to the development of partial and
multiple correlation (Yule, 1897, 1907). It was not until 1926 that
M. Ezekiel and B. B. Smith defined part correlation and, although not
explicitly, the notion of bipartial correlation (Ezekiel, 1941).

Although multiple correlation coefficients enable us to investigate
associations between one variate and a set of variates, simple, partial,
part, and bipartial correlation coefficients are used as measures of
association between two variates. Generalizing the notion of correlation
between one variate and a set of variates to two sets of variates,
Hotelling (1935, 1936) developed canonical correlation coefficients and
canonical variates to investigate linear relationships between two sets
of variates. However, it was Roy (1957, p. 26) and more recently Rao (1969)
who generalized the concept of canonical correlation to partial canonical
correlation, which is no more than the canonical correlation between two
sets of variates $Y$ and $X$ after the effect of a third set of variates is
removed.

By extending the definitions of part and bipartial correlation to
sets of variates, we develop part and bipartial canonical correlations
and illustrate how these coefficients and their corresponding canonical
variates may be used to explore relationships which exist among sets of
normally distributed variates.

2. Canonical Correlation Analysis
Given a set of p ability variates $Y$ and a set of q personality
variates $X$, where $[Y', X']'$ is normally distributed with
variance-covariance matrix

$$\Sigma_{(p+q) \times (p+q)} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix},$$

a researcher may want to assess the degree of relationship between the
two sets of variates $Y$ and $X$. The method of canonical correlation analysis,
developed by Hotelling (1935, 1936) for this purpose, was to determine linear
combinations of the original variates, $U = a'Y$ and $V = b'X$, of unit variance
such that the simple correlation between U and V was maximal. The
mathematical procedure for accomplishing this is to maximize

$$\rho_{UV} = a' \Sigma_{12} b$$

subject to the constraints $a' \Sigma_{11} a = b' \Sigma_{22} b = 1$, which leads to the
determinantal equation in $\rho^2$

$$\left| \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} - \rho^2 \Sigma_{11} \right| = 0$$

(see, for example, Timm, 1975, p. 349). The s = min (p, q) nonzero positive
square roots $\rho_i$ of the roots $\rho_i^2$ are called the canonical correlations
between the canonical variates $U_i = a_i'Y$ and $V_i = b_i'X$, i = 1, 2, ..., s. The
coefficient vectors $a_i$ of the canonical variates $U_i$ are the eigenvectors
of the determinantal equation; to obtain the coefficient vectors $b_i$ for
each $V_i$, the relationship

$$b_i = \frac{\Sigma_{22}^{-1} \Sigma_{21} a_i}{\rho_i}, \qquad i = 1, 2, \ldots, s$$

is used. The canonical variates within each set, $U_i$ and $V_i$, are clearly
uncorrelated and have unit variance.

Furthermore, the covariance between $U_i$ and $V_i$ is $\rho_i$ for i = 1, ..., s, and
0 otherwise:

$$\mathrm{cov}(U_i, V_i) = \rho_i, \qquad i = 1, \ldots, s$$

$$\mathrm{cov}(U_i, V_j) = 0, \qquad i \neq j$$
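The determinantal equation above can be solved numerically as a symmetric eigenproblem. The following is a minimal sketch, not the CANON program: it assumes NumPy, the function name `canonical_correlations` is ours, and it assumes that $\Sigma_{11}$ and $\Sigma_{22}$ are positive definite and that the retained roots are nonzero.

```python
import numpy as np

def canonical_correlations(S11, S12, S22):
    """Solve |S12 S22^{-1} S21 - rho^2 S11| = 0.

    Returns (rho, A, B): the s = min(p, q) canonical correlations in
    descending order and the coefficient vectors a_i, b_i as columns.
    """
    S11, S12, S22 = (np.asarray(M, dtype=float) for M in (S11, S12, S22))
    # Factor S11 = L L' so that the generalized problem becomes the
    # ordinary symmetric eigenproblem of K = L^{-1} M L^{-T}.
    L = np.linalg.cholesky(S11)
    M = S12 @ np.linalg.solve(S22, S12.T)            # S12 S22^{-1} S21
    K = np.linalg.solve(L, np.linalg.solve(L, M).T).T
    K = (K + K.T) / 2.0                              # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(K)
    order = np.argsort(vals)[::-1]                   # largest root first
    s = min(S12.shape)
    rho = np.sqrt(np.clip(vals[order][:s], 0.0, 1.0))
    A = np.linalg.solve(L.T, vecs[:, order][:, :s])  # a_i' S11 a_i = 1
    B = np.linalg.solve(S22, S12.T @ A) / rho        # b_i = S22^{-1} S21 a_i / rho_i
    return rho, A, B
```

The signs of the columns of `A` and `B` are arbitrary, as with any eigenvector routine; only the pairing of $a_i$ with $b_i$ is fixed by the last relationship.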

Investigating the canonical variates further, it is of interest to
determine the correlation of each canonical variate for a set with the
individual variates within the set. These correlations represent the
contribution of each variate to the composite canonical variate and help
in the interpretation of canonical variates. The correlations are given
by the expressions:

$$\mathrm{corr}(Y, U_i) = \mathrm{corr}(Y, a_i'Y) = \Sigma_{11} a_i / \sigma_Y$$

$$\mathrm{corr}(X, V_i) = \mathrm{corr}(X, b_i'X) = \Sigma_{22} b_i / \sigma_X$$

where the division is element by element by the standard deviations of the
individual variates.

Besides using correlations within a set to better understand canonical
variates, we should also examine the relationships between canonical
variates in one set and individual variates in the other. These become

$$\mathrm{corr}(X, U_i) = \mathrm{corr}(X, a_i'Y) = \Sigma_{21} a_i / \sigma_X = \rho_i \, \mathrm{corr}(X, V_i)$$

$$\mathrm{corr}(Y, V_i) = \rho_i \, \mathrm{corr}(Y, U_i)$$
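The four sets of correlations just described follow directly from the partitioned matrix and one pair of coefficient vectors. A short sketch, assuming NumPy; the function name `structure_correlations` and its argument order are our own choices for illustration:

```python
import numpy as np

def structure_correlations(S11, S12, S22, a, b):
    """Correlations of the original variates with one canonical pair (U, V).

    a and b are the unit-variance coefficient vectors of U = a'Y and V = b'X.
    Returns corr(Y, U), corr(X, V), corr(X, U), corr(Y, V).
    """
    S11, S12, S22 = (np.asarray(M, dtype=float) for M in (S11, S12, S22))
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    sd_y = np.sqrt(np.diag(S11))   # standard deviations of the Y variates
    sd_x = np.sqrt(np.diag(S22))   # standard deviations of the X variates
    y_u = S11 @ a / sd_y           # within-set correlations
    x_v = S22 @ b / sd_x
    x_u = S12.T @ a / sd_x         # cross-set: equals rho * corr(X, V)
    y_v = S12 @ b / sd_y           # cross-set: equals rho * corr(Y, U)
    return y_u, x_v, x_u, y_v
```

When correlation matrices are supplied, the standard deviations are all one and the expressions reduce to plain matrix products such as $R_{11} a_i$.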

To apply the theory developed above for a sample of N observations,
the population variance-covariance matrices $\Sigma_{ij}$ are replaced by
their sample counterparts $S_{ij}$. Alternatively, sample correlation matrices
$R_{ij}$ may also be used since the roots of $S_{11}^{-1} S_{12} S_{22}^{-1} S_{21}$ and
$R_{11}^{-1} R_{12} R_{22}^{-1} R_{21}$ are identical. Although the vectors $a_i$ and $b_i$
associated with the determinantal equation with the $S_{ij}$'s replacing the $\Sigma_{ij}$'s will
have units of measurement proportional to the original variates, the units
of $U_i$ and $V_i$ may not be meaningful. Canonical variates obtained by using
correlation matrices have no units of measurement and should be evaluated
in terms of standardized variates.

To test the null hypothesis that the p-variates are unrelated to the
q-variates,

$$H_0: \Sigma_{12} = 0,$$

several multivariate criteria have been proposed. Bartlett (1938) outlines
a procedure for testing the hypothesis when the sample sizes are large.
He defines

$$\Lambda = \prod_{i=1}^{s} (1 - r_i^2)$$

where $r_i^2$ is the sample estimator of $\rho_i^2$ and uses a chi-square approximation
for the distribution of $\Lambda$. The hypothesis of independence is rejected if

$$X_B^2 = -[(N-1) - (p + q + 1)/2] \log \Lambda > \chi_\alpha^2(pq)$$

where $\chi_\alpha^2(pq)$ is the upper $\alpha$ percentile of a chi-squared distribution with
pq degrees of freedom.

If the null hypothesis of no relationship (or independence) can be
rejected, the contribution of the first root to $\Lambda$ may be removed and the
significance of the remaining roots evaluated (see Bartlett, 1951, or Rao,
1952, p. 370). In general, with s' < s = min (p, q) roots removed, we define

$$\Lambda^* = \prod_{i=s'+1}^{s} (1 - r_i^2).$$

Partitioning Bartlett's chi-square statistic,

$$X_B^2 = -[(N-1) - (p + q + 1)/2] \log \Lambda^*,$$

we find that $X_B^2$ has an approximate chi-square distribution with (p-s')(q-s')
degrees of freedom and may be used to test the significance of the roots
s' + 1 through s. The tests for significant canonical correlations, other
than the first, are very conservative unless the correlations removed are
close to 1 (see Williams, 1967).
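The partitioned statistic is simple enough to compute directly. The sketch below uses only the Python standard library; the function name `bartlett_chi_square` and the argument `removed` (standing for s') are ours, and the critical value still has to be looked up in a chi-square table.

```python
import math

def bartlett_chi_square(r_squared, N, p, q, removed=0):
    """Bartlett's statistic for the roots removed+1, ..., s.

    r_squared: sample squared canonical correlations, largest first.
    Returns (statistic, degrees_of_freedom).
    """
    # log of Lambda* = product of (1 - r_i^2) over the roots still under test
    log_lambda = sum(math.log(1.0 - r2) for r2 in r_squared[removed:])
    statistic = -((N - 1) - (p + q + 1) / 2.0) * log_lambda
    df = (p - removed) * (q - removed)
    return statistic, df
```

Calling the function with `removed=0` gives the overall test of independence; `removed=1` tests the roots that remain after the first has been set aside.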
An alternative to Bartlett's procedure has been developed by Roy (1953)
and is called Roy's largest root criterion. To test the significance of
each root using Roy's procedure, the parameters

$$s = \min(p-i+1, \; q-i+1)$$

$$m = \frac{|p-q| - 1}{2}$$

$$n = \frac{N - p - q - 2}{2}$$

are defined for Heck's (1960) charts and the characteristic roots $r_i^2$ are
compared to a critical value $\theta_\alpha(s, m, n)$, found in the appropriate chart.

The hypothesis of independence for two sets of random variates reduces
to some familiar univariate results. If the number of variates in the p
set is one, the hypothesis reduces to

$$H_0: \Sigma_{12} = 0 \quad \text{or} \quad \rho_{0(1, 2, \ldots, q)} = 0$$

$$H_1: \Sigma_{12} \neq 0 \quad \text{or} \quad \rho_{0(1, 2, \ldots, q)} \neq 0$$

and is tested using $F = (R^2/q)/[(1-R^2)/(N-q-1)]$ where $R^2$ is the square of
the sample multiple correlation coefficient. For p = q = 1, the hypothesis
reduces to

$$H_0: \rho = 0$$

$$H_1: \rho \neq 0$$

and is tested using $t = r\sqrt{N-2} / \sqrt{1-r^2}$ where r is the sample correlation coefficient
(Fisher, 1915).
3. Partial Canonical Correlation Analysis

Extending Hotelling's development of canonical analysis to three sets
of variates, Rao (1969), using some results given in Anderson (1958, p. 33),
developed the notion of partial canonical correlation analysis, which may
be used to assess the partial independence of two sets of variates given
a third set of variates.

Given a set of p variates $Y$, a set of q variates $X$ and a set of r
variates $Z$, where $[Y', X', Z']'$ is normally distributed with
variance-covariance matrix

$$\Sigma_{(p+q+r) \times (p+q+r)} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \Sigma_{13} \\ \Sigma_{21} & \Sigma_{22} & \Sigma_{23} \\ \Sigma_{31} & \Sigma_{32} & \Sigma_{33} \end{bmatrix},$$

we may be interested in assessing the degree of the relationship between
$Y$ and $X$ after removing the linear effect of the variates in the $Z$ set.
That is, we want to find from the variates $e_Y = Y - \hat{Y}$ and $e_X = X - \hat{X}$, where
$e_Y$ and $e_X$ are the residual vectors (obtained from regressing $Y$ on $Z$ and
$X$ on $Z$), linear combinations $U = a'e_Y$ and $V = b'e_X$ of unit variance such
that the simple correlation between U and V is maximal. Mathematically,
this is equivalent to maximizing

$$\rho_{UV} = a' \Sigma_{12.3} b$$

subject to the constraints $a' \Sigma_{11.3} a = b' \Sigma_{22.3} b = 1$. The matrices
$\Sigma_{ij.3}$ are the elements of the variance-covariance matrix of the residual
vectors $e_Y$ and $e_X$,

$$\Sigma_{.3} = \begin{bmatrix} \Sigma_{11.3} & \Sigma_{12.3} \\ \Sigma_{21.3} & \Sigma_{22.3} \end{bmatrix} = \begin{bmatrix} \Sigma_{11} - \Sigma_{13} \Sigma_{33}^{-1} \Sigma_{31} & \Sigma_{12} - \Sigma_{13} \Sigma_{33}^{-1} \Sigma_{32} \\ \Sigma_{21} - \Sigma_{23} \Sigma_{33}^{-1} \Sigma_{31} & \Sigma_{22} - \Sigma_{23} \Sigma_{33}^{-1} \Sigma_{32} \end{bmatrix},$$

the variance-covariance matrix of the conditional distribution of $Y$ and
$X$ given $Z$. Maximizing $\rho_{UV}$, as shown by Rao (1969), leads to the determinantal
equation in $\rho^2$

$$\left| \Sigma_{12.3} \Sigma_{22.3}^{-1} \Sigma_{21.3} - \rho^2 \Sigma_{11.3} \right| = 0.$$

The s = min (p, q) nonzero positive square roots $\rho_i$ of the roots $\rho_i^2$
are called the partial canonical correlations between the partial canonical
variates $U_i = a_i'e_Y$ and $V_i = b_i'e_X$, i = 1, ..., s. The coefficient vectors $a_i$
of each $U_i$ are the eigenvectors of the determinantal equation and the
relationship between $a_i$ and $b_i$ is given by

$$b_i = \frac{\Sigma_{22.3}^{-1} \Sigma_{21.3} a_i}{\rho_i}, \qquad i = 1, \ldots, s.$$
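The residual matrices $\Sigma_{ij.3}$ are Schur complements of $\Sigma_{33}$, so the partial analysis is the analysis of Section 2 applied to them. A sketch under that reading, assuming NumPy, a matrix ordered Y, X, Z, and helper names (`residual_cov`, `squared_roots`, `partial_canonical_correlations`) that are ours:

```python
import numpy as np

def residual_cov(Sab, Sac, Scc, Scb):
    """Sab.c = Sab - Sac Scc^{-1} Scb."""
    return Sab - Sac @ np.linalg.solve(Scc, Scb)

def squared_roots(S11, S12, S22):
    """Roots rho^2 of |S12 S22^{-1} S21 - rho^2 S11| = 0, largest first."""
    M = np.linalg.solve(S11, S12 @ np.linalg.solve(S22, S12.T))
    vals = np.sort(np.linalg.eigvals(M).real)[::-1]
    return np.clip(vals[: min(S12.shape)], 0.0, 1.0)

def partial_canonical_correlations(S, p, q):
    """S is the (p+q+r) x (p+q+r) matrix with variates ordered Y, X, Z."""
    S = np.asarray(S, dtype=float)
    y, x, z = slice(0, p), slice(p, p + q), slice(p + q, None)
    S11_3 = residual_cov(S[y, y], S[y, z], S[z, z], S[z, y])
    S12_3 = residual_cov(S[y, x], S[y, z], S[z, z], S[z, x])
    S22_3 = residual_cov(S[x, x], S[x, z], S[z, z], S[z, x])
    return np.sqrt(squared_roots(S11_3, S12_3, S22_3))
```

With p = q = r = 1 the single value returned is the absolute value of the ordinary partial correlation coefficient, which gives a convenient check on an implementation.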



To test the hypothesis of partial independence,

$$H_0: \Sigma_{12.3} = 0$$

$$H_1: \Sigma_{12.3} \neq 0$$

using Roy's criterion, the parameters are defined by

$$s = \min(p - i + 1, \; q - i + 1)$$

$$m = \frac{|p - q| - 1}{2}$$

$$n = \frac{N - r - p - q - 2}{2}$$

and the $r_i^2$ are compared to the critical value $\theta_\alpha(s, m, n)$ at the level
$\alpha$ found from the Heck charts. Alternatively, defining $\Lambda$ as

$$\Lambda = \prod_{i=1}^{s} (1 - r_i^2),$$

Bartlett's criterion

$$X_B^2 = -[(N-r-1) - (p + q + 1)/2] \log \Lambda \sim \chi^2(pq)$$

may be used.

Some familiar univariate results are evident from the multivariate
procedure. If p = 1, the partial canonical correlation coefficient becomes
the partial multiple correlation coefficient (see Rao, 1973, p. 268).
Setting p = q = 1, the test of partial independence reduces to testing

$$H_0: \rho_{12.3} = 0$$

$$H_1: \rho_{12.3} \neq 0$$

which is tested using the familiar t statistic, $t = r_{12.3}\sqrt{N-3} / \sqrt{1-r_{12.3}^2}$,
where $r_{12.3}$ is the sample partial correlation coefficient (see Anderson,
1958, p. 84),

$$r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{1-r_{13}^2} \, \sqrt{1-r_{23}^2}}.$$

Following Fisher (1924), to test that a partial correlation coefficient
is equal to zero under normality, the t statistic for testing that a simple
correlation coefficient is equal to zero is modified by subtracting one
degree of freedom from the degrees of freedom for error for every variate
removed, and the simple correlation coefficient is replaced by a partial
correlation coefficient. Extending this rule to the partial canonical
correlation analysis procedure, we were able to obtain, by analogy, a
test for multivariate partial independence.

4. Part and Bipartial Canonical Correlation Analysis
Following Pearson, partial correlation coefficients under normality
are no more than simple correlation coefficients in conditional
distributions. Holding several variates constant in a multivariate normal
distribution allows one to investigate relationships between two variates
while controlling for the other variates which directly influence the
two variates under study. Such an explanation of a partial correlation
coefficient would not satisfy most researchers. Alternatively we find people
saying that a simple partial correlation coefficient represents an estimate
of what a simple correlation would be if we were able to calculate the
simple correlation coefficient at any one of several levels of a third
fixed variable. This is still unsatisfactory since we never check this when
we use the coefficient.

Going back to the derivation of a partial correlation coefficient, we
said that it is the correlation in residuals after linear effects due to a
common variate or set of variates are removed. Implied in this statement is
the following causal relationship given by the causal system:

[Causal diagram: Z -> X, Z -> Y]

If Z does not influence the variation in X and Y as shown above, the
interpretation of a partial correlation coefficient is unclear since by
"partialling out" Z from X and Y we are removing the linear effect of Z on
both X and Y.

Provided Z influences both X and Y, interpretation of the partial
correlation coefficient is meaningful. It would not make sense to
calculate a partial correlation coefficient if Z influences X but not Y.
For this situation we would have the following diagram:

[Causal diagram: Z -> X only]

For this case the correlation between X and Y is best estimated by controlling
for the influence of Z on X. That is, we want the correlation between Y and
X partialling out Z from X and not Y. Such a correlation is frequently
called a part correlation coefficient and for the three variate case is
represented by

$$r_{1(2.3)} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{23}^2}}.$$

To derive $r_{1(2.3)}$ following Yule, we assume a linear relationship between
X and Z, $X = \alpha + \beta Z$, and find the simple correlation between $X - \alpha - \beta Z$
and Y.

As shown by McNemar (1969, p. 322) the test statistic for testing

$$H_0: \rho_{1(2.3)} = 0$$

$$H_1: \rho_{1(2.3)} \neq 0$$

under joint normality is $t = r_{1(2.3)}\sqrt{N-3} / \sqrt{1-R_{1.23}^2}$, where $R_{1.23}^2$ is the
squared multiple correlation of variate 1 with variates 2 and 3. Unfortunately,
one may not merely substitute part correlations for partial correlations in
the formula for testing $\rho_{12.3} = 0$ to test that a part correlation coefficient
is equal to zero. Since $|r_{1(2.3)}| \leq |r_{12.3}|$, as may be seen by examining the
formulas for these two coefficients, substituting $r_{1(2.3)}$ for $r_{12.3}$ in the
t statistic for testing $\rho_{12.3} = 0$ yields an approximate test procedure
for testing part independence.
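To extend the notion of part correlation to the multivariate case,
we again assume that $[Y', X', Z']'$ is normally distributed with
variance-covariance matrix

$$\Sigma_{(p+q+r) \times (p+q+r)} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \Sigma_{13} \\ \Sigma_{21} & \Sigma_{22} & \Sigma_{23} \\ \Sigma_{31} & \Sigma_{32} & \Sigma_{33} \end{bmatrix}.$$

Now, however, we are interested in assessing the degree of relationship
between $Y$ and $X$ after removing the linear effect of the variates in the $Z$
set from $X$ and not $Y$. That is, we want to find linear combinations of
the variates $Y$ and $e_X$, $U = a'Y$ and $V = b'e_X$, of unit variance such that the
correlation between U and V is maximal. This is equivalent to maximizing

$$\rho_{UV} = a' \Sigma_{12.3} b$$

subject to the constraints $a' \Sigma_{11} a = b' \Sigma_{22.3} b = 1$, where the
matrix $\Sigma_{1(2.3)}$ is defined by

$$\Sigma_{1(2.3)} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12.3} \\ \Sigma_{21.3} & \Sigma_{22.3} \end{bmatrix}.$$

This again leads us to finding the roots and vectors of a determinantal
equation,

$$\left| \Sigma_{12.3} \Sigma_{22.3}^{-1} \Sigma_{21.3} - \rho_{1(2.3)}^2 \Sigma_{11} \right| = 0.$$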
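The only change from the partial analysis is that the Y-set block is left unadjusted. A minimal sketch of the part canonical correlations under that reading, assuming NumPy and a matrix ordered Y, X, Z; the function name `part_canonical_correlations` is ours:

```python
import numpy as np

def part_canonical_correlations(S, p, q):
    """Part canonical correlations, partialling Z out of the X-set only.

    S is the (p+q+r) x (p+q+r) matrix with variates ordered Y, X, Z.
    """
    S = np.asarray(S, dtype=float)
    y, x, z = slice(0, p), slice(p, p + q), slice(p + q, None)
    B = np.linalg.solve(S[z, z], S[z, x])      # S33^{-1} S32
    S11 = S[y, y]                              # Y-set left as it is
    S12_3 = S[y, x] - S[y, z] @ B              # S12 - S13 S33^{-1} S32
    S22_3 = S[x, x] - S[x, z] @ B              # S22 - S23 S33^{-1} S32
    # Roots of |S12.3 S22.3^{-1} S21.3 - rho^2 S11| = 0
    M = np.linalg.solve(S11, S12_3 @ np.linalg.solve(S22_3, S12_3.T))
    roots = np.sort(np.linalg.eigvals(M).real)[::-1][: min(p, q)]
    return np.sqrt(np.clip(roots, 0.0, 1.0))
```

For p = q = r = 1 the result is the absolute value of $r_{1(2.3)}$, and it never exceeds the corresponding partial correlation, in line with the univariate inequality noted earlier.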
In addition to part and partial correlation coefficients, other
simple correlations are also important to the understanding of linear
relationships among variates; for example, suppose two variates Z and
W are highly correlated and that the causal relationship among four
variates is as follows:

[Causal diagram: Z -> Y, W -> X, with Z and W highly correlated]

To determine the correlation between Y and X in this case, the linear
influence of Z on Y and of W on X is controlled by removing the influence
of Z on Y and of W on X. This leads us to the bipartial correlation
coefficient

$$r_{(1.3)(2.4)} = \frac{r_{12} - r_{13} r_{23} - r_{14} r_{24} + r_{13} r_{34} r_{24}}{\sqrt{1 - r_{13}^2} \, \sqrt{1 - r_{24}^2}}$$

which reduces to a part correlation coefficient if either $r_{13}$ or $r_{24}$ equals zero.
Alternatively if the relationship among the variates were given by the system

[Causal diagram]

the partial correlation coefficient

$$r_{12.34} = \frac{r_{12.3} - r_{14.3} r_{24.3}}{\sqrt{1 - r_{14.3}^2} \, \sqrt{1 - r_{24.3}^2}}$$

would be of interest. The causal relationships among variates influence
the researcher's selection of a correlation coefficient and hence the
analysis of a set of data.

To extend the idea of a bipartial correlation coefficient to four
sets of variates we assume that $[Y', X', Z', W']'$ is normally distributed
with variance-covariance matrix

$$\Sigma_{(p+q+r+t) \times (p+q+r+t)} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \Sigma_{13} & \Sigma_{14} \\ \Sigma_{21} & \Sigma_{22} & \Sigma_{23} & \Sigma_{24} \\ \Sigma_{31} & \Sigma_{32} & \Sigma_{33} & \Sigma_{34} \\ \Sigma_{41} & \Sigma_{42} & \Sigma_{43} & \Sigma_{44} \end{bmatrix}$$

and letting

$$\Sigma_{12}^* = \Sigma_{12} - \Sigma_{13} \Sigma_{33}^{-1} \Sigma_{32} - \Sigma_{14} \Sigma_{44}^{-1} \Sigma_{42} + \Sigma_{13} \Sigma_{33}^{-1} \Sigma_{34} \Sigma_{44}^{-1} \Sigma_{42}$$

we form the matrix

$$\Sigma_{(1.3)(2.4)} = \begin{bmatrix} \Sigma_{11.3} & \Sigma_{12}^* \\ \Sigma_{21}^* & \Sigma_{22.4} \end{bmatrix}$$

which is the variance-covariance matrix of the residuals $e_Y = Y - \hat{Y}$ and
$e_X = X - \hat{X}$, where $\hat{Y}$ is found by regressing $Y$ on $Z$ and $\hat{X}$ is found by regressing
$X$ on $W$. Notice that if the third set of variates is not in our model,
$\Sigma_{(1.3)(2.4)}$ reduces to $\Sigma_{1(2.4)}$, which is analogous to the univariate
case.

To assess the degree of relationship between $Y$ and $X$ we again want
to maximize the correlation between $U = a'e_Y$ and $V = b'e_X$. This leads to
the solution of the determinantal equation

$$\left| \Sigma_{12}^* \Sigma_{22.4}^{-1} \Sigma_{21}^* - \rho_{(1.3)(2.4)}^2 \Sigma_{11.3} \right| = 0.$$

The s = min (p, q) nonzero positive square roots $\rho_{i(1.3)(2.4)}$ are the
bipartial canonical correlations between the bipartial canonical variates
$U_i = a_i'e_Y$ and $V_i = b_i'e_X$, i = 1, ..., s. The relationship between the
coefficients is given by

$$b_i = \frac{\Sigma_{22.4}^{-1} \Sigma_{21}^* a_i}{\rho_{i(1.3)(2.4)}}, \qquad i = 1, \ldots, s.$$

To test the hypothesis of bipartial independence

$$H_0: \Sigma_{12}^* = 0$$

$$H_1: \Sigma_{12}^* \neq 0$$

we have for this test only an approximate procedure in the multivariate
case. The parameters using Roy's criterion are given by

$$s = \min(p - i + 1, \; q - i + 1)$$

$$m = \frac{|p - q| - 1}{2}$$

$$n = \frac{N - \max(r, t) - p - q - 2}{2}$$

and the bipartial canonical correlations are compared to the critical
values found in the appropriate Heck chart.
Defining $\Lambda$ as

$$\Lambda = \prod_{i=1}^{s} \left(1 - r_{i(1.3)(2.4)}^2\right),$$

Bartlett's criterion defined by

$$X_B^2 = -[(N - \max(r, t) - 1) - (p + q + 1)/2] \log \Lambda \sim \chi^2(pq)$$

might also be used.

The approximate test of part independence for the case of multivariate
part canonical analysis follows a similar procedure with r replacing
max (r, t) in the formulas and $\Sigma_{12.3}$ replacing $\Sigma_{12}^*$ in the hypothesis.
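The bipartial matrices can be assembled from the four-set matrix with two regressions, one of every block on Z and one on W. The sketch below assumes NumPy and a matrix ordered Y, X, Z, W; the function name `bipartial_canonical_correlations` and its return convention are ours and are not taken from the CANON program.

```python
import numpy as np

def bipartial_canonical_correlations(S, p, q, r):
    """Bipartial canonical correlations: Z out of the Y-set, W out of the X-set.

    S is ordered Y (p), X (q), Z (r), W (remaining rows and columns).
    Returns the correlations and the blocks (S11.3, S12*, S22.4).
    """
    S = np.asarray(S, dtype=float)
    y, x = slice(0, p), slice(p, p + q)
    z, w = slice(p + q, p + q + r), slice(p + q + r, None)
    Bz = np.linalg.solve(S[z, z], S[z, :])   # S33^{-1} [S31 S32 S33 S34]
    Bw = np.linalg.solve(S[w, w], S[w, :])   # S44^{-1} [S41 S42 S43 S44]
    S11_3 = S[y, y] - S[y, z] @ Bz[:, y]
    S22_4 = S[x, x] - S[x, w] @ Bw[:, x]
    S12_star = (S[y, x] - S[y, z] @ Bz[:, x] - S[y, w] @ Bw[:, x]
                + S[y, z] @ Bz[:, w] @ Bw[:, x])
    # Roots of |S12* S22.4^{-1} S21* - rho^2 S11.3| = 0, largest first
    M = np.linalg.solve(S11_3, S12_star @ np.linalg.solve(S22_4, S12_star.T))
    roots = np.sort(np.linalg.eigvals(M).real)[::-1][: min(p, q)]
    return np.sqrt(np.clip(roots, 0.0, 1.0)), (S11_3, S12_star, S22_4)
```

With one variate in each set the single correlation returned equals the absolute value of the bipartial coefficient $r_{(1.3)(2.4)}$ given earlier in this section.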

Example 1: Canonical Correlation Analysis
Suppose a researcher was interested in investigating the relationship
between three achievement variables $A_1$, $A_2$, and $A_3$ and two personality
variables $P_1$ and $P_2$, where the correlation matrix among the variates
$Y' = (P_1, P_2)$ and $X' = (A_1, A_2, A_3)$ is

$$R = \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix}$$

          P1      P2      A1      A2      A3
P1    1.0000
P2     .7951  1.0000
A1     .2617   .3341  1.0000
A2     .6720   .5876   .3703  1.0000
A3     .3390   .3404   .2114   .3548  1.0000

Solving the determinantal equation

$$\left| R_{12} R_{22}^{-1} R_{21} - r^2 R_{11} \right| = 0$$

the sample canonical correlations $r_i$ are obtained from the roots $r_i^2$:

$$r_1^2 = .4746, \qquad r_1 = .6889$$

$$r_2^2 = .0375, \qquad r_2 = .1936$$

Rejecting the hypothesis of independence and finding that only the first
root is significantly different from zero, the researcher at first blush
might conclude that the two sets of variates are significantly related and
that the proportion of variance in common to the two standardized canonical
variates

$$U_1 = .7752 P_1 + .2662 P_2$$

$$V_1 = .0520 A_1 + .8991 A_2 + .1831 A_3$$

is $r_1^2 = .4746$. However, further investigation into the significant
canonical variates and the variates within a set would yield that, with
$Z_Y$ and $Z_X$ denoting the standardized variates,

$$\mathrm{corr}(Z_Y, U_1) = R_{11} a_1 = (.987, \; .883)'$$

$$\mathrm{corr}(Z_X, V_1) = R_{22} b_1 = (.424, \; .983, \; .513)'$$

which indicates that $P_1$ and $P_2$ are equally important to $U_1$, but that $A_2$
is more important to $V_1$ than either $A_1$ or $A_3$. Furthermore,

$$V_{Y|U_1}^2 = \frac{(.987)^2 + (.883)^2}{2} = .88$$

of the variance of the first set is accounted for by $U_1$ and only

$$V_{X|V_1}^2 = \frac{(.424)^2 + (.983)^2 + (.513)^2}{3} = .47$$

of the variance in the other set is accounted for by $V_1$. Investigation
of the correlations between the variates in one set and the significant
canonical variates yields

$$\mathrm{corr}(X, U_1) = r_1 \, \mathrm{corr}(Z_X, V_1) = (.292, \; .677, \; .353)'$$

$$\mathrm{corr}(Y, V_1) = r_1 \, \mathrm{corr}(Z_Y, U_1) = (.680, \; .608)'$$

This shows that $A_2$ in the achievement set is influenced most by the
personality canonical variate and that the achievement canonical variate is
influenced equally by both personality variates. More specifically, 22%
of the variance common to the achievement variates can be accounted for
by a linear combination of personality variables,

$$U_{X|U_1}^2 = \frac{(.292)^2 + (.677)^2 + (.353)^2}{3} = .22$$

whereas the proportion of variance in the personality variables accounted
for by the achievement canonical variate is 42%,

$$U_{Y|V_1}^2 = \frac{(.680)^2 + (.608)^2}{2} = .416$$

In summary, given the two sets of variates
$Y' = (P_1, P_2)$ and $X' = (A_1, A_2, A_3)$,
it appears that the proportion of variance "in common" to the two
significant canonical variates is about 47%. However, 88% of the
variance in the set $Y$ is accounted for by $U_1$ and only 42% of the variance
in $Y$ is accounted for by $V_1$. Similarly, 47% of the variance in the set
$X$ is accounted for by $V_1$, but only 23% of the variance in $X$ is accounted
for by the canonical variate $U_1$.

Stewart and Love (1968) observed that

$$U_{X|U_i}^2 = V_{X|V_i}^2 \, r_i^2$$

and termed these redundancy indexes since they better summarize the overlap
between two sets of variates than the square of a canonical correlation.
For our data $U_{Y|V_1}^2 = .416$ and $U_{X|U_1}^2 = .226$; that is, the redundancy in
$Y$ given $X$ is .416 and the redundancy in $X$ given $Y$ is .226. The larger the
redundancy indexes, the larger the overlap among the variates in each domain.
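The redundancy index is a two-step computation: average the squared within-set correlations, then scale by the squared canonical correlation. A small sketch in plain Python; the function names `variance_extracted` and `redundancy` are ours:

```python
def variance_extracted(structure_corr):
    """Proportion of a set's variance accounted for by its own canonical
    variate: the mean of the squared variate-variate correlations."""
    return sum(c * c for c in structure_corr) / len(structure_corr)

def redundancy(structure_corr, r_squared):
    """Stewart and Love redundancy: variance extracted by the set's own
    canonical variate times the squared canonical correlation."""
    return variance_extracted(structure_corr) * r_squared

# Within-set correlations and first root from Example 1
y_on_u1 = [.987, .883]
x_on_v1 = [.424, .983, .513]
r1_squared = .4746

print(round(variance_extracted(y_on_u1), 2))       # share of Y-set variance in U1
print(round(redundancy(y_on_u1, r1_squared), 3))   # redundancy in Y given X
print(round(redundancy(x_on_v1, r1_squared), 3))   # redundancy in X given Y
```

Because the inputs are rounded to three places, values recomputed this way can differ in the last digit from figures obtained with unrounded intermediate results.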
Example 2: Bipartial Canonical Correlation Analysis

Using a random sample of 502 twelfth-grade students from the Project
Talent survey (supplied by William W. Cooley at the University of
Pittsburgh), data were collected on 11 tests: (1) general information
test, part I, (2) general information test, part II, (3) English,
(4) reading comprehension, (5) creativity, (6) mechanical reasoning,
(7) abstract reasoning, (8) mathematics, (9) sociability inventory,
(10) physical science interest inventory, and (11) office work interest
inventory. Knowing that verbal ability tests (3) through (5) are highly
correlated with the nonverbal tests (6) through (8), the investigator
was interested in investigating the relationship between the general
information tests (1) and (2) and the interest inventory measures, tests
(9) through (11). Since prior knowledge would indicate that the
relationships among the sets are of the form

[Causal diagram: Z = {3, 4, 5} -> Y = {1, 2}; W = {6, 7, 8} -> X = {9, 10, 11}; Z and W highly correlated]

the set of data lends itself to a bipartial canonical analysis. The
correlation matrix for the sets of variates is shown in Table 1.

Table 1. Original Variate Intercorrelation Matrix

          Y1     Y2     X1     X2     X3     Z1     Z2     Z3     W1     W2     W3
Y1     1.000
Y2      .861  1.000
X1     -.011   .062  1.000
X2      .573   .397   .055  1.000
X3     -.349  -.234   .084  -.246  1.000
Z1      .492   .550   .083   .094   .109  1.000
Z2      .698   .765   .021   .275  -.087   .613  1.000
Z3      .644   .621   .001   .340  -.119   .418   .595  1.000
W1      .661   .519  -.075   .531  -.364   .160   .413   .522  1.000
W2      .487   .469   .007   .202  -.079   .456   .530   .433   .451  1.000
W3      .761   .649   .030   .500  -.191   .566   .641   .556   .547   .517  1.000

Using the CANON computer program described in Section 7 we find that the
matrix of partial variances and covariances is

$$\begin{bmatrix} S_{Y.Z} & S_{(Y.Z)(X.W)} \\ S_{(X.W)(Y.Z)} & S_{X.W} \end{bmatrix}$$

          Y1     Y2  |    X1     X2     X3
Y1      .429   .263  |  -.012   .133  -.163
Y2      .263   .365  |   .051   .060  -.110
X1     -.012   .051  |   .987   .076   .054
X2      .133   .060  |   .076   .635  -.041
X3     -.163  -.110  |   .054  -.041   .858

and the eigenvalues of the determinantal equation

$$\left| S_{(Y.Z)(X.W)} S_{X.W}^{-1} S_{(X.W)(Y.Z)} - r^2 S_{Y.Z} \right| = 0$$

are .133 and .022. Using Roy's criterion for the first root we have s = 2, m = 0
and n = 247.5 and using the Heck (1960) charts we find that this root differs
from zero at the .01 level. Similarly for the second root s = 1, m = 0 and
n = 247.5. When s = 1 we calculate the F-statistic

$$F = \frac{n+1}{m+1} \cdot \frac{r_i^2}{1 - r_i^2}$$

(see, for example, Morrison, 1967, p. 166-167) and the test statistic
distribution is $F_{2m+2, \, 2n+2}$. For our data

$$F = \frac{248.5}{1} \cdot \frac{.022}{.978} = 5.590$$

Referring to tables of F we find that the second root also differs from zero
at the .01 level.

The bipartial canonical correlation coefficients are .364 and .150. Using
Bartlett's approximate chi squared test we find that the hypothesis of bipartial
independence is rejected for both roots (chi squared = 81.676, df = 6, p < .0001)
and also for the second root after having removed the first root (chi squared =
11.223, df = 2, p = .00367). Thus we reach the same conclusions using Bartlett's
test as we do using Roy's.
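The figures in this example can be checked approximately from the printed matrix of partial variances and covariances. The sketch below assumes NumPy; the (1,1) entry is taken as .429, which is an assumption, and because every input is rounded to three places the results should agree with the reported values only to about two significant figures.

```python
import math
import numpy as np

# Printed partial matrix, ordered Y1 Y2 | X1 X2 X3.
# The first diagonal entry (.429) is an assumed reading.
S = np.array([
    [ .429,  .263, -.012,  .133, -.163],
    [ .263,  .365,  .051,  .060, -.110],
    [-.012,  .051,  .987,  .076,  .054],
    [ .133,  .060,  .076,  .635, -.041],
    [-.163, -.110,  .054, -.041,  .858],
])
N, p, q, r, t = 502, 2, 3, 3, 3
S11, S12, S22 = S[:p, :p], S[:p, p:], S[p:, p:]

# Roots of |S12 S22^{-1} S21 - r^2 S11| = 0, largest first
M = np.linalg.solve(S11, S12 @ np.linalg.solve(S22, S12.T))
roots = np.sort(np.linalg.eigvals(M).real)[::-1]
correlations = np.sqrt(roots)

# Bartlett's criterion with max(r, t) variates partialled out
multiplier = (N - max(r, t) - 1) - (p + q + 1) / 2.0
chi_all = -multiplier * math.log(np.prod(1.0 - roots))   # df = p * q
chi_second = -multiplier * math.log(1.0 - roots[1])      # df = (p-1)(q-1)

# F statistic for the second root, using the m and n quoted in the example
m, n = 0.0, 247.5
F_second = (n + 1.0) / (m + 1.0) * roots[1] / (1.0 - roots[1])

print(roots, correlations, chi_all, chi_second, F_second)
```

Small departures from the reported chi squared and F values are expected here, since the program worked from unrounded quantities.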
The standardized canonical variates are

$$U_1 = 1.674 Y_1 - 0.255 Y_2$$

$$U_2 = -1.180 Y_1 + 2.205 Y_2$$

$$V_1 = -0.120 X_1 + 0.863 X_2 - 0.737 X_3$$

$$V_2 = 0.919 X_1 - 0.397 X_2 - 0.461 X_3$$

and the correlation coefficients between the original and canonical variates
are shown in Table 2.

Table 2. Original Variate-Partial Canonical Variate Correlations

          U1      U2      V1      V2
Y1      .993    .115    .362    .017
Y2      .576    .817    .210    .122
X1     -.034    .128   -.093    .858
X2      .260   -.031    .715   -.205
X3     -.265   -.053   -.728   -.356

Examination of these correlation coefficients helps to understand the
relationships existing among the original variates and the partial canonical variates.
The printout from the CANON program also indicates that $U_1$ accounts for .66
of the Y-set variance and $U_2$ accounts for .34. Similarly $V_1$ accounts for .35
of the X-set variance and $V_2$ accounts for .30.

The redundancies, or proportions of variance in the Y-set and X-set that
are accounted for by the significant canonical variates derived from the opposite
sets, are shown in Table 3.

Table 3. Redundancies

            V1      V2    Overall
Y-set     .087    .008     .095

            U1      U2    Overall
X-set     .046    .007     .053

These data indicate that although there are significant relationships
between the information tests and interest inventories after partialing out
verbal ability and nonverbal ability measures, respectively, the proportions
of accounted for variance are rather small. Examining the correlations in
Tables 2 and 3 we see that the strongest relationship is between $V_1$ and the
Y-set, and that $X_2$ and $X_3$ contribute most to $V_1$, $X_1$ being almost uncorrelated
with $V_1$. The next strongest relationship is between $U_1$ and the X-set, with
$Y_1$ contributing much more to $U_1$ than does $Y_2$.

7. The CANON Computer Program
The CANON computer program allows the researcher to analyze multivariate
data by any of the four techniques discussed in this paper: Canonical Analysis,
Partial Canonical Analysis, Part Canonical Analysis, and Bipartial Canonical
Analysis.

The user may input raw data, a variance-covariance matrix, or a
correlation matrix, and specifies the type of analysis and number of variates
in each set. The first two sets of variates are referred to as the Y-set
and the X-set and are the sets whose relationship is to be studied. The third
set (Z), if used, contains the variates to be partialed out of the Y-set
and the X-set in partial canonical analysis, the Y-set or the X-set in part
canonical analysis, or the Y-set in bipartial canonical analysis. The fourth
set (W) contains the variates to be partialed out of the X-set in bipartial
canonical analysis.

The number of variates in the Y-set must be less than or equal to the
number of variates in the X-set. Also, the variates must be input in the
following order: Y-set, X-set, Z-set, W-set.

The program is written in FORTRAN IV for a DEC PDP-10. All calculations
are done using double precision. Conversion of the program for other
computer systems should not be difficult. Since the program stores "PROBL"
and "FINIS" in two single-precision memory locations and checks the first five
characters of the title and finish cards with the contents of these locations,
changes will be necessary for computers that do not store 5 alphanumeric
characters in a single-precision memory location. Similar changes will be
necessary for some of the labels for the output, which are also stored in
memory via DATA statements. These changes may be the only changes required
for many computers, but the user should check that the names of FORTRAN-supplied
functions used in the program correspond to those available on the user's
system. Listings of the programs and card decks are available upon request
from the authors.
INPUT TO CANON

The input to the program is as follows:

(a) Title Card

The title card contains the characters PROBL in columns 1 through 5
and any title that the user chooses in columns 6 through 80.

(b) Problem Card

The second card contains 9 numbers specifying the nature of the
problem and type of analysis. The first 8 numbers are integers and each
is punched in a 5-digit field, right justified. The 9th number is a
significance level to be used as a criterion for defining significant
canonical relationships, according to Bartlett's test, and is a 4-digit
decimal fraction punched with a decimal point. The numbers in this card
are:

Col. 1-5     N = No. of observations in the sample
Col. 6-10    NP = No. of variates in the Y-set
Col. 11-15   NQ = No. of variates in the X-set (NP ≤ NQ)
Col. 16-20   NR = No. of variates in the Z-set (punch zero or
             leave blank if no Z-set)
Col. 21-25   NT = No. of variates in the W-set (punch zero or
             leave blank if no W-set)
Col. 26-30   Punch 1 if Canonical Analysis
             Punch 2 if Partial Canonical Analysis
             Punch 3 if Part Canonical Analysis, Partialing
                     Z-set from Y-set
             Punch 4 if Part Canonical Analysis, Partialing
                     Z-set from X-set
             Punch 5 if Bipartial Canonical Analysis
Col. 31-35   NRMC = No. of format cards
Col. 36-40   Punch 0 or leave blank if raw data to be input
             Punch 1 if variance-covariance or correlation
                     matrix to be input
Col. 41-45   PIN = significance level for retention of canonical
             variates according to the Bartlett test.
             Punched with decimal point. Punch 1.0
             if it is desired to have all possible canonical
             variates extracted.

(c) Format Card

The input format contains one F-field for each variate that is
input. The user should remember the order in which the variate sets
must be input, as specified below.

(d) Data

The data may be input in raw form (IN = zero) or in the form of a
variance-covariance or correlation matrix (IN = one).

(i) Raw Data: The values on the variates for each observation are
input in a single record containing one or more cards. The order
of input must be: Y-set variates, X-set variates, Z-set (if used)
variates, W-set (if used) variates. The variates are punched
as specified on the variable format card, card c.
(ii) Variance-covariance or Correlation Matrix: The complete
square symmetric matrix of variances and covariances or
intercorrelations of all variates is input. The matrix must be in
the form:

    S(Y,Y)   S(Y,X)   S(Y,Z)   S(Y,W)
    S(X,Y)   S(X,X)   S(X,Z)   S(X,W)
    S(Z,Y)   S(Z,X)   S(Z,Z)   S(Z,W)
    S(W,Y)   S(W,X)   S(W,Z)   S(W,W)

where S(A,B) represents the variance-covariance matrix or correlation matrix
of variate-set A with variate-set B. The number of variates in the Y-set must
be less than or equal to the number in the X-set.

The values in each row of the matrix are input in one record containing
one or more cards, punched as specified in the variable format card, card c.

(e) End of Job Card

The program allows the user to stack jobs to be run sequentially,
each job containing a complete set of cards a through d. Thus if a second
job is to be run, a second title, problem, etc. card follows the data from
the first job. The data for the last job is followed by an end-of-job card
which contains the characters "FINIS" in columns 1 through 5.
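The fixed-column layout of the problem card described in (b) can be illustrated with a short sketch that writes and reads such a card as a 45-character string. This is an illustration in Python, not part of the FORTRAN program; the function names, the dictionary keys, and the way the significance level is rendered in its five columns are all assumptions.

```python
def format_problem_card(n_obs, n_y, n_x, n_z=0, n_w=0, analysis=1,
                        n_format_cards=1, matrix_input=0, pin=1.0):
    """Build the 45-column problem card: eight right-justified 5-column
    integers followed by the significance level in columns 41-45."""
    if n_y > n_x:
        raise ValueError("the Y-set may not have more variates than the X-set")
    if analysis not in (1, 2, 3, 4, 5):
        raise ValueError("analysis code must be 1 through 5")
    ints = (n_obs, n_y, n_x, n_z, n_w, analysis, n_format_cards, matrix_input)
    # Assumed rendering: ".0100" for fractions, "1.000" for the value 1.0
    pin_text = f"{pin:.4f}".lstrip("0") if pin < 1 else f"{pin:.3f}"
    return "".join(f"{v:5d}" for v in ints) + pin_text.rjust(5)

def parse_problem_card(card):
    """Read the nine fields back; blank integer fields count as zero."""
    card = card.ljust(45)
    names = ("n_obs", "n_y", "n_x", "n_z", "n_w", "analysis",
             "n_format_cards", "matrix_input")
    fields = {}
    for k, name in enumerate(names):
        text = card[5 * k: 5 * k + 5].strip()
        fields[name] = int(text) if text else 0
    fields["pin"] = float(card[40:45])
    return fields

# A bipartial analysis of a correlation matrix with set sizes 2, 3, 3, 3
card = format_problem_card(502, 2, 3, 3, 3, analysis=5,
                           n_format_cards=1, matrix_input=1, pin=.01)
print(repr(card))
print(parse_problem_card(card))
```

The two checks in `format_problem_card` mirror the restrictions stated in the text, namely that the Y-set is no larger than the X-set and that the analysis code lies between 1 and 5.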
OUTPUT FROM CANON

The output from CANON includes the following (all values are printed
in scientific notation; e.g., .1234D-01 = .1234 x 10^-1 = .01234):

(a) Variance-covariance matrix (or correlation matrix when it is input)
of all variates.

(b) Standard deviations of all variates, by set.

(c) Variance-covariance matrix after partialing. Output when the analysis
is a partial, part or bipartial canonical analysis, this matrix contains
the variances and covariances of the Y and X sets after partialing.

(d) Eigenvalues from the determinantal equation formed for the analysis
and the values necessary for determining significance by Roy's criterion
using the Heck charts.

(e) Canonical, Partial canonical, Part canonical or Bipartial canonical
correlation coefficients and Bartlett's test for the significance of
the coefficients.

(f) Standardized canonical coefficients for the Y-set variates and
correlation coefficients between the Y-set variates and canonical
variates derived from the Y-set.

(g) Standardized canonical coefficients for the X-set variates and
correlation coefficients between the X-set variates and canonical variates
derived from the X-set.

(h) Proportions of variance in the Y-set accounted for by each
significant (Bartlett's test) canonical variate derived from the Y-set,
and the similar proportions for the X-set.

(i) Correlation coefficients between Y-set variates and the significant
canonical variates derived from the X-set, along with the redundancy
for each canonical variate and the overall redundancy.

(j) Correlation coefficients between X-set variates and the significant
canonical variates derived from the Y-set, along with the redundancies
for each canonical variate and the overall redundancy.

Canonical variates are normalized to have unit variance in the sample.